

Search for: All records

Creators/Authors contains: "Huang, Yuan"


  1. Free, publicly-accessible full text available December 1, 2025
  2. Abstract A fundamental problem in functional data analysis is to classify a functional observation based on training data. Functional data classification has gained immense popularity and utility across a wide array of disciplines, including biology, engineering, environmental science, medical science, neurology, and social science. This rapid growth in applications indicates an urgent need for a systematic approach to developing efficient classification methods and scalable algorithmic implementations. We therefore conduct a comprehensive review of classification methods for functional data. The review aims to bridge the gap between the functional data analysis community and the machine learning community, and to inspire new principles for functional data classification. This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification; Statistical Models > Classification Models; Data: Types and Structure > Time Series, Stochastic Processes, and Functional Data
  3. Summary Cancer is a heterogeneous disease. Finite mixture of regression (FMR), an important heterogeneity analysis technique when an outcome variable is present, has been extensively employed in cancer research, revealing important differences in the associations between a cancer outcome/phenotype and covariates. Cancer FMR analysis has been based on clinical, demographic, and omics variables. A relatively recent alternative source of data is histopathological images, which have long been used for cancer diagnosis and staging. Recently, it has been shown that high-dimensional histopathological image features, extracted using automated digital image processing pipelines, are effective for modeling cancer outcomes/phenotypes. Histopathological imaging–environment interaction analysis has been further developed to expand the scope of cancer modeling and histopathological imaging-based analysis. Motivated by the significance of cancer FMR analysis and a still strong demand for more effective methods, in this article we take the natural next step and conduct cancer FMR analysis based on models that incorporate low-dimensional clinical/demographic/environmental variables, high-dimensional imaging features, and their interactions. Complementary to many existing studies, we develop a Bayesian approach for accommodating high dimensionality, screening out noise, identifying signals, and respecting the “main effects, interactions” variable selection hierarchy. An effective computational algorithm is developed, and simulation shows advantageous performance of the proposed approach. The analysis of The Cancer Genome Atlas data on lung squamous cell cancer yields findings that differ from those of alternative approaches.
  4. Abstract Gene duplication is increasingly recognized as an important mechanism for the origination of new genes, as revealed by comparative genomic analysis. However, how new duplicate genes contribute to phenotypic evolution remains largely unknown, especially in plants. Here, we identified the new gene EXOV, derived from a partial gene duplication of its parental gene EXOVL in Arabidopsis thaliana. EXOV is a species-specific gene that originated within the last 3.5 million years and shows strong signals of positive selection. Unexpectedly, RNA-sequencing analyses revealed that, despite its young age, EXOV has acquired many novel direct and indirect interactions in which the parental gene does not engage. This observation is consistent with the high, selection-driven substitution rate of its encoded protein, in contrast to the slowly evolving EXOVL, suggesting an important role for EXOV in phenotypic evolution. We observed significant differentiation of morphological changes for all phenotypes assessed in genome-edited and T-DNA insertional single mutants and in double T-DNA insertion mutants in EXOV and EXOVL. We discovered a substantial divergence of phenotypic effects by principal component analyses, suggesting neofunctionalization of the new gene. These results reveal a young gene that plays critical roles in biological processes that underlie morphological evolution in A. thaliana. 